NSSI questionnaires revisited: A data mining approach to shorten the NSSI questionnaires

Background and objective
Non-suicidal self-injury (NSSI) is a psychological disorder in which sufferers deliberately damage their own body tissue, often severely enough to require intensive care. As some individuals hide their NSSI behaviors, other people can only identify them if they catch them in the act of injuring, or via dedicated questionnaires. However, these questionnaires are long and tedious to answer, so the answers might be inconsistent. Hence, in this study, for the first time, we abstracted a larger questionnaire (662 items in total) to only 22 items (questions) via data mining techniques. Then, we trained several machine learning algorithms to classify individuals into two classes based on their answers.

Methods
Data from 277 previously questioned participants were used with several data mining methods to select the features (questions) that best represent NSSI; then, 245 different people were asked to participate in an online test to validate those features via machine learning methods.

Results
The highest accuracy and F1-score of the features selected via the genetic algorithm are 80.0% and 74.8%, respectively, for a Random Forest algorithm. Cronbach's alpha of the online test (validation on the selected features) is 0.82. Moreover, results suggest that an MLP can classify participants into the two classes NSSI Positive and NSSI Negative with 83.6% accuracy and an 83.7% F1-score based on the answers to only 22 questions.

Conclusion
While psychologists previously used many combined questionnaires to see whether someone is involved in NSSI, the present study showed, via various data mining methods, that only 22 questions are enough to predict whether someone is involved. Different machine learning algorithms were then utilized to classify participants based on their NSSI behaviors, among which an MLP with 10 hidden layers had the best performance.


Introduction
Non-suicidal self-injury (NSSI) is a psychological disorder in which the sufferer intentionally damages their own body tissue without intending to die; the injuries are often severe and require intensive care [1]. People of any age can be affected; however, the disorder is more prevalent among adolescents and young people [2]. The age of onset is usually reported to lie between 12 and 14 [3][4][5][6][7]. Although NSSI is not intended to end one's life, people who are involved in NSSI show a higher desire to commit suicide [5,8], as their pain tolerance increases and they can become more ruthless with themselves [9]. Hence, NSSI is potentially the second leading cause of death among people aged 15 to 19, and the 10th among those aged 10 to 14 [10]. Clearly, this disorder has many negative impacts on sufferers' lives; therefore, identifying people who are, or could become, involved in NSSI can play an important role in controlling, treating, and preventing it at older ages [11][12][13].
Although ample research has been conducted to distinguish people who plan NSSI from those who do not, recent analyses show that our ability to do so has been quite limited, since only a few of the factors in people's lives have been considered [14]. Consequently, the prognosis of NSSI is still not accurate [15]. NSSI and suicidal behaviors can be differentiated by (but not limited to) nine major factors: intent, lethality, chronicity, methods and functions, cognitions, reactions, aftermath, demographics, and prevalence [5]. While NSSI is primarily intended to eliminate 1) negative emotion, 2) body alienation (dissatisfaction), and 3) dissociation, suicide tends to be a way out of all of them at once when injurers are neither satisfied with the deceptive temporary pain caused by injuries nor cured via therapies [1,5][16][17][18][19][20][21][22][23].
NSSI is a complex concept because various intertwined factors contribute to its conception in one's mind, and this complexity prevents psychologists from clearly understanding which factor(s) they should address in an injurer's life [24,25]. This makes it more likely that the majority of injurers will remain involved in NSSI in different ways, as no one can thoroughly take their needs and motivations into consideration [24,25]. Researchers in this field arrived at a comprehensive definition of NSSI only in the last 10 to 15 years; this indicates that, historically, knowledge about NSSI was limited and has only recently started to develop [26].
Despite these shortcomings, some dedicated psychological questionnaires help to identify individuals who are currently involved in NSSI (see Measures). However, it has been shown that, due to latent relations among the questions, combining the questionnaires describes involvement in NSSI better than using them separately [27]. One limitation of this approach is the large number of questions that a person must answer to be diagnosed. Answering so many questions is time-consuming and tedious, so participants become frustrated, many answer carelessly or inattentively, and the results are likely to be inaccurate [27]. One reason we still have to rely on questionnaires is that some injurers hide their behaviors so that others do not object to them [7]. Without questionnaires, such injurers can only be identified if they are caught injuring or if they slip up.
Hence, smaller subset(s) of the original questionnaires need to be extracted in order to 1) keep participants from becoming tired, so that their answers tend to be more consistent, 2) omit the questions that explicitly ask about participants' involvement in NSSI, so that participants are not tempted to lie, and 3) omit the questions that may shift psychologists' attention from the actual facts about participants' involvement in NSSI to less important ones.
Traditionally, psychologists subjectively related one's lifestyle and mentality to their involvement in NSSI, whereas Artificial Intelligence (AI)-based approaches show promising results in detecting NSSI involvement [28]. Furthermore, AI approaches enable us to consider multiple factors, such as age and gender, simultaneously [29]. The fields of AI and machine learning try to enable computers to mimic living species' neural systems so that they can perceive and/or behave as living species do [30][31][32]. Examples include an artificial spider that walks like a real one [33] and an AI-based model that locates cancer nuclei on histopathology images as an expert does [34].
AI-based diagnosis methods have seen particular attention in medical and healthcare systems as they can often become a life-saving tool for people [34][35][36][37][38][39][40]. For example, since data from Alzheimer's Disease (AD) are multimodal and time series in nature, it is a burden for specialists to handle all the data that their patients provide; thus, they may not be able to correctly decide on the outcomes [38]. Hence, state-of-the-art AI-backed approaches can be utilized to predict AD progression 2.5 years in the future, based on the multimodal time-series data [38]. In another related work, a mobile application was introduced to detect skin disease via AI [39]. The method used MobileNet-V2 and Long Short-Term Memory (LSTM) to make an algorithm that could understand complex patterns on skin and maintain stateful information for precise predictions.
Combining wearable sensors and social networking, a healthcare monitoring framework based on the cloud environment and a big data analytics engine stores and analyzes healthcare data provided by sensors or social networks [40]. The system classifies the healthcare data to predict drug side effects and abnormal conditions in patients. It also classifies patients' health conditions using their healthcare data related to diabetes, blood pressure, mental health, and drug reviews. In another similar work, Bluetooth Low Energy (BLE)-based sensors gather users' vital signs, such as blood pressure, heart rate, weight, and blood glucose, and send them to smartphones. AI-based methods then provide early prediction of diabetes given the user's sensor data as input [37].
Therefore, healthcare data classification via AI can provide quick and accurate automated approaches for medical centers as well as in-home diagnosis systems. These methods remove the subjectivity of specialists and try to map the input values to unbiased outputs [35]. In the current case, for instance, similar methods can take data representing people's behaviors as input and decide whether they are involved in NSSI or not [28]. These methods could potentially mitigate some of the issues described earlier and can be delivered relatively quickly.
Considering these, for the first time, we aimed to summarize 17 different questionnaires (662 questions in total) into only 22 questions (items) using feature selection and data mining techniques. Then, we built an AI approach that classifies people as either NSSI positive or NSSI negative with 83.6% accuracy based on their answers to these questions. The model can be used as a decision support system (DSS) to help psychologists quickly identify their patients with acceptable accuracy. In summary, this paper has five aspects of novelty:
• Reduces the 662-item questionnaire for NSSI identification to a 22-item one.
• Proposes the fastest approach for identifying people involved in NSSI via Machine Learning and data mining techniques.
• Validates the abstracted questionnaire via two different statistical populations.
• Gives insight into utilizing Data Mining methods to abstract large questionnaires.
• Provides a public dataset for future studies in similar cases.
The remainder of the paper is organized as follows: Materials and methods describes the participants, measures, and data preparation, as well as the machine learning and data mining methods used. Results presents the feature selection, the abstracted questionnaire, and its validation, and the Discussion section discusses them. Finally, the Conclusion section summarizes the paper and mentions a limitation and a prospect.

Participants
The present study builds on the research conducted by Jann MacIsaac [27], whose participants were students of the University of Windsor, Canada. Data collection began in September 2017 and ended in April 2018. Initially, 314 participants were selected for that study, but only 277 were included in the final dataset. Among them, 194 people did not have NSSI disorder, 44 were NSSI-Distal, and 39 were NSSI-Proximal. The data of all 277 participants are used in the current feature selection phase to choose the questions that are highly related to NSSI. Note that the dataset by Jann MacIsaac was not publicly available at the time of conducting this research (2019-2022); thus, further information about it is not provided. Please contact them for more information.
After selecting features (questions) from the provided data in the first phase, a website was created to test and validate the chosen features, and 409 different participants from Azarbaijan Shahid Madani University, Tabriz, Iran were recruited. They were familiarized with the procedure in a virtual meeting and verbally agreed to participate. All data are kept anonymous with no traceable information. Those who did not consent did not provide any data; hence, documented consent was not required. All procedures performed in this study are in accordance with the ethical standards of the Institutional Review Board (IRB) of Azarbaijan Shahid Madani University and were reviewed and approved before the study began. After filtering out the careless and inattentive participants, data from 245 participants were included in the second phase; their ages range from 18 to 35 (M = 27.88, SD = 9.26). The sample consisted of 152 females (62.04%) and 93 males (37.96%); demographics are presented in Table 1.

Measures
Jann MacIsaac included 19 questionnaires in their datasets, descriptions of which are as follows:
• Demographic questionnaire: The demographic information of all participants including age, gender, race/ethnicity, marital status, university year of enrollment, faculty, employment status, current residence, GPA, and meditation.
• Deliberate Self-Harm Inventory (DSHI): A 16-item self-report questionnaire to assess whether participants have ever performed a particular self-injurious behavior or not [6].
• Inventory of Statements About Self-Injury (ISAS): A two-part questionnaire, the first section of which assesses the lifetime frequency of 12 NSSI behaviors performed intentionally (i.e., on purpose) and without suicidal intent. Its second section assesses 13 potential functions of NSSI: affect-regulation, anti-dissociation, anti-suicide, autonomy, interpersonal boundaries, interpersonal influence, marking distress, peer-bonding, self-care, self-punishment, revenge, sensation seeking, and toughness [41].
• Risk-Taking 18 (RT-18): An 18-item questionnaire that is used to assess adults' overall risky behaviors. This questionnaire has two levels: risk-taking and risk assessment. RT-18 sums all of the answers of a patient; the higher they score, the higher their level of risk-taking or risk assessment [42].
• Difficulties in Emotion Regulation Scale 18 (DERS-18): An 18-item questionnaire designed to assess clinical problems related to emotion regulation. The questionnaire consists of six main subgroups: awareness of personal emotions, transparency about personal emotions, acceptance of personal emotions, access to emotion regulation strategies, ability to participate in purposeful behaviors when exposed to negative emotions, and ability to manage impulses generated during negative emotions [43].
• Positive and Negative Affect Schedule (PANAS): A 20-item self-report questionnaire designed to assess both positive affect and negative affect moods [45].
• NIH Flanker Inhibitory Control and Attention Test: Assesses participants' executive function and attention [27].
• Mindful Attention and Awareness Scale (MAAS): A 15-item questionnaire that assesses individual differences in the frequency of mindful states over time [48].
• Perceived Stress Scale (PSS): A 14-item questionnaire that assesses the level of tolerance of individuals (as a representative of perceived stress), when they are in a stressful situation [51].
• Social Provision Scale (SPS): A 24-item questionnaire that assesses participants' level of social understanding. It describes six different social functions or provisions that may be received from relationships with others: guidance, reliable alliances, reassurance of worth, attachment, social integration, and the opportunity for nurturance [54].
The datasets also contain validation questions (i.e., reversed items) to test the accuracy of the participants' answers. The total number of available features (questions) is 662.

Data filtering and pre-processing
In the original dataset, the optional questions, the descriptive questions (e.g., "how is your hometown"), and the non-quantitative ones (such as those of NIH) were discarded. This way, the 662 features were reduced to 403. The analyses for the second phase of the test included filtering the outliers and calculating the intra-individual response variability (IRV) index [57] across the 28 items (including the reversed items; see Validation and building a DSS model) to which participants responded. All of the analyses were performed using Python and libraries such as Pandas (https://pandas.pydata.org), SciPy (https://scipy.org), Matplotlib (https://matplotlib.org), Pingouin (https://pingouin-stats.org), and Seaborn (https://seaborn.pydata.org). Respondents with very low IRV values were excluded from the analyses. Additionally, we excluded participants whose answers to a reversed item and its original counterpart differed by two or more. Lastly, we excluded any participants who took more than 20 or fewer than five minutes to complete the whole experiment, as our piloting suggested that the average time to complete the test was approximately 10 minutes.
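The three exclusion rules above can be sketched as follows. This is a minimal illustration, not the authors' code: the 1-5 answer scale, the IRV cut-off of 0.5, and the function names are assumptions, since the text only says that "very low" IRV values were excluded. IRV is taken here as the standard deviation of one respondent's answers, and a reversed item is mapped back to the original direction before it is compared with its counterpart.

```python
from statistics import pstdev


def irv(responses):
    """Intra-individual response variability: SD of one person's answers."""
    return pstdev(responses)


def is_careless(responses, reversed_pairs, minutes,
                scale_min=1, scale_max=5, irv_min=0.5):
    """Return True if a respondent should be excluded.

    responses      -- answers to all items (including reversed ones)
    reversed_pairs -- (original_answer, reversed_item_answer) tuples
    minutes        -- total completion time in minutes
    scale_min, scale_max and irv_min are illustrative assumptions.
    """
    # Rule 1: near-constant answering (straight-lining) gives a very low IRV.
    if irv(responses) < irv_min:
        return True
    # Rule 2: a reversed item, mapped back to the original direction,
    # should differ from its counterpart by less than two points.
    for original, reversed_answer in reversed_pairs:
        mapped_back = scale_min + scale_max - reversed_answer
        if abs(original - mapped_back) >= 2:
            return True
    # Rule 3: completion time must lie within the 5-20 minute window.
    return minutes < 5 or minutes > 20


attentive = is_careless([1, 4, 2, 5, 3, 2], [(4, 2)], minutes=9)  # False
straight_liner = is_careless([3] * 6, [(3, 3)], minutes=9)        # True
```

In practice the cut-off for "very low" IRV would have to be chosen from the observed distribution of IRV values in the collected sample.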

Feature selection and supervisor algorithms
Popular evolutionary algorithms, namely the genetic, Ant Colony (AC), and Particle Swarm (PS) algorithms, as well as a Linear Support Vector Classifier (LSVC), are used to select features [58], while nine machine learning algorithms supervise them: random forests (RF), support vector machine (SVM) with regularization parameter C = 1.1, decision tree (DT), k-nearest neighbor (kNN), perceptron, multilayer perceptron (MLP) with 10 hidden layers, each with 80% more neurons than the input size, linear discriminant analysis (LDA), AdaBoost (AB), and a Long Short-Term Memory (LSTM) network. Other parameters are left at their defaults. Lastly, the number of output neurons of each model matches the number of classes (see Feature selection for more information).
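As a small worked example of the MLP sizing rule, the sketch below computes the hidden-layer widths for a given number of input questions. Rounding to the nearest integer is an assumption; it is consistent with the 40 neurons per layer reported later for the 22-item input (22 × 1.8 = 39.6). The function name is hypothetical.

```python
def mlp_hidden_layer_sizes(n_inputs, n_layers=10, extra=0.8):
    """Width of each hidden layer: the input size plus `extra` (80%), rounded."""
    width = round(n_inputs * (1 + extra))
    return (width,) * n_layers


sizes = mlp_hidden_layer_sizes(22)  # ten hidden layers of 40 neurons each
```

If scikit-learn were used (the text does not say which library implements the models), a tuple of this shape is what the `hidden_layer_sizes` parameter of `MLPClassifier` expects.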
The former algorithms (feature selectors) select a varying subset of questions in each epoch and pass it to the latter algorithms (supervisors). The supervisors are trained on the answers to those questions, learning to map them to a particular NSSI class. If the supervisors achieve acceptable evaluation results on a particular subset, the items in that subset are strongly associated with NSSI and can thus be good representatives of NSSI involvement [58].
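The selector/supervisor loop described above can be sketched with a small genetic algorithm. This is an illustrative sketch under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for the supervisors (RF, SVM, and so on), the population size, mutation rate, and tournament size are arbitrary, and the data are synthetic. Each individual is a boolean mask over the questions, and its fitness is the supervisor's accuracy on a held-out split.

```python
import random


def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid supervisor on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, y in zip(X_test, y_test):
        pred = min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(cols, centroids[c])))
        correct += pred == y
    return correct / len(y_test)


def genetic_feature_selection(X_train, y_train, X_test, y_test,
                              pop_size=20, generations=15,
                              mutation_rate=0.05, seed=0):
    """Evolve boolean feature masks; return the best mask and its accuracy."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        return centroid_accuracy(X_train, y_train, X_test, y_test, mask)

    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        next_gen = scored[:2]  # elitism: carry over the two best subsets
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(scored, 3), key=fitness)
            p2 = max(rng.sample(scored, 3), key=fitness)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


# Synthetic data: feature 0 tracks the label, features 1-5 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(80)]
X = [[label + data_rng.gauss(0, 0.3)] + [data_rng.gauss(0, 1) for _ in range(5)]
     for label in y]
mask, acc = genetic_feature_selection(X[:60], y[:60], X[60:], y[60:])
```

Because the fitness is measured on a single held-out split, the accuracy of the winning subset can be optimistic, which is one reason to re-check the selected items with cross-validation afterwards.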
The LSTM model has four memory layers with 22 inputs and 22 outputs each, followed by three fully connected layers with 128, 256, and 64 neurons, respectively. Its last layer has two or three neurons (depending on the number of output classes; see Feature selection for more information). The last layer uses the softmax activation function; all other layers use ReLU.

Evaluation metrics
To evaluate the efficiency of the selected features, the Precision, Recall, F1-score, and Accuracy metrics are used [59,60]. In their definitions, |C| represents the number of classes, where in our case either C = {No NSSI, NSSI Distal, NSSI Proximal} or C = {NSSI Neg., NSSI Pos.} (see Feature selection for more information), and i denotes the class whose metrics are being calculated. TP_i indicates the number of cases that belong to the i-th class and are classified correctly, while TN_i indicates the number of cases that do not belong to the i-th class and were not classified as class i either. In contrast, FP_i indicates the number of cases that did not belong to class i but were incorrectly classified as class i, and finally, FN_i indicates the number of cases that belong to class i but were classified incorrectly. Precision shows how much we can rely on the model when it assigns participants to a class, while recall measures the ability of the model to find all the relevant cases in a given dataset. Finally, the F1-score is the harmonic mean of precision and recall, and it is useful for finding the best trade-off between the two quantities [61]. Note that, as the task is a classification problem, we argue that statistical tests would add little to the analysis, since the lists being tested comprise only two or three distinct discrete values. Statistical tests are more informative in regression problems, where the values are continuous; the values in the current task cannot meet the normality assumption that such tests rely on.
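The defining relations themselves are not reproduced in this text. Assuming macro-averaging over the |C| classes, which matches the symbols defined above, they would take the following standard form; the per-class form of the accuracy in particular is an assumption.

```latex
\begin{align}
\text{Precision} &= \frac{1}{|C|}\sum_{i \in C}\frac{TP_i}{TP_i + FP_i}, \\
\text{Recall} &= \frac{1}{|C|}\sum_{i \in C}\frac{TP_i}{TP_i + FN_i}, \\
\text{F1-score} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \\
\text{Accuracy} &= \frac{1}{|C|}\sum_{i \in C}\frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i}.
\end{align}
```

The reported values are consistent with the F1-score being computed from the averaged precision and recall: for the RF model, 2 × 73.1 × 76.5 / (73.1 + 76.5) ≈ 74.8, and for the MLP model, 2 × 83.6 × 83.9 / (83.6 + 83.9) ≈ 83.7.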

Feature selection
First, we tested our approach on three classes (No NSSI, NSSI Distal, and NSSI Proximal), without and with data augmentation. We initially tested different numbers of features (questions), but only the best and second-best performing ones are reported in Table 2. Also, among the mentioned feature selectors, only the genetic algorithm and LSVC had promising results; therefore, the others are omitted from the comparisons. As is evident, the DT with the genetic algorithm obtained the highest overall efficiency, with an accuracy, precision, recall, and F1-score of 58.9%, 50%, 47.5%, and 48.7%, respectively. To improve the results, we augmented the data using the SMOTE synthetic oversampling method [62], then selected features, and report the results in Table 3. In this case, the most efficient result was that of the genetic algorithm with an SVM model, with an accuracy, precision, recall, and F1-score of 71.4%, 55.1%, 56.5%, and 55.7%, respectively. According to the F1-scores, all of these approaches behave almost randomly and are not reliable.
To overcome the random behavior, the three classes are reduced to only two: NSSI Positive (the combination of NSSI Distal and NSSI Proximal) and NSSI Negative. Note that the results of the subsequent experiments are reported only for the genetic algorithm, since it was more accurate than LSVC (and the others). Table 4 compares the models' efficiencies on the initial features and on the selected features. As is evident, RF obtained the highest efficiency, with an accuracy, precision, recall, and F1-score of 80.0%, 73.1%, 76.5%, and 74.8%, respectively. To see how data augmentation contributes to the accuracy, Table 5 compares the effects of the ADASYN, SVM-SMOTE, and SMOTE data augmentation methods, which led the feature selector to select 45, 34, and 19 questions, respectively. It is evident from the table that ADASYN with DT achieved an accuracy, precision, recall, and F1-score of 81.6%, 81.0%, 80.2%, and 80.6%, respectively, whereas those of SVM-SMOTE with LDA were 76.2%, 75.8%, 73.2%, and 74.5%, and those of SMOTE with LDA were 71.4%, 71.2%, 68.5%, and 69.8%.
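The class merge and the idea behind SMOTE-style augmentation can be sketched as follows. The class counts come from the Participants section; everything else (the function name, k = 3, the toy answer rows) is an illustrative assumption rather than the configuration used in the study. A synthetic minority row is created by interpolating between a minority sample and one of its nearest minority neighbours; ADASYN and SVM-SMOTE differ in how they choose where to generate new samples and are not shown.

```python
import random
from collections import Counter

# Merge the three original classes into the two used from here on.
TO_BINARY = {
    "No NSSI": "NSSI Negative",
    "NSSI Distal": "NSSI Positive",
    "NSSI Proximal": "NSSI Positive",
}
labels = ["No NSSI"] * 194 + ["NSSI Distal"] * 44 + ["NSSI Proximal"] * 39
binary_counts = Counter(TO_BINARY[label] for label in labels)


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared distance to `base`.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2
                                        for a, b in zip(row, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append([a + gap * (b - a)
                          for a, b in zip(base, neighbour)])
    return synthetic


positives = [[4, 5, 3], [5, 4, 4], [3, 5, 5], [4, 4, 2]]  # toy answer rows
new_rows = smote_like(positives, n_new=6)
```

After the merge there are 194 negative and 83 positive participants, which is the imbalance that oversampling is meant to reduce. Note that interpolated answers are generally non-integer, unlike real questionnaire responses.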

The abstracted questions
According to the results in Tables 4 and 5, it is possible to reduce the original questionnaire to a 19- or 22-item one. As the latter showed a better performance than the former, we selected the latter as our abstracted questionnaire. Although the former has fewer items, the difference (three questions) is not large enough to lead participants to answer carelessly. Hence, we chose performance over conciseness. Although there are statistical ways to see which questionnaire is better, since machine learning algorithms can consider relations between items better than the statistical methods, we opted for the modern (machine learning) solution for this task and thus relied on the provided evaluation metrics. The items in the abstracted questionnaire are selected from nine questionnaires: Demographic Information Questionnaire, TAS [44], PANAS [45], UPPS-P [46,47], PSS [51], CD-RISC [53], SPS [54], ICSRLE [55], and SCS-SF [56]; they are available in Table 6. Although the set of 45 items better represents NSSI behaviors (according to Table 5), participants may still respond carelessly to 45 tedious questions; hence, this set is also discarded. Despite showing a 74.8% F1-score, the 22 items are still more accurate than a subjective psychologist who is exposed to fatigue and has to assess many participants who might not be honest with their responses in a large and tedious questionnaire. Since the previous experiments split the data into train and test sets to select features, it is not clear whether the algorithms are under- or over-fitted. Therefore, we used 5-fold cross-validation on the 22 questions with the three best-performing algorithms (with and without data augmentation) to assure consistency, and present the results in Table 7. As is evident from the table, the results are consistent with the previous ones (particularly Table 4), as the difference between them is marginal. That is, the best-performing algorithm is still RF, whose evaluation results are similar to those in the previous table, which indicates the consistency of the selected features.
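The 5-fold cross-validation used for this consistency check can be sketched as below. It is a plain (non-stratified) k-fold written with the standard library only; whether the study used stratified folds is not stated, and the `fit`/`predict` callables and the majority-class model are hypothetical stand-ins for the real classifiers.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the indices once and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Return the mean accuracy and the per-fold accuracies."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Train on four folds, evaluate on the remaining one.
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k, scores


# Toy usage: a model that always predicts the majority training class,
# evaluated on the class sizes of the first dataset (194 vs. 83).
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


y = ["neg"] * 194 + ["pos"] * 83
X = [[0]] * len(y)
mean_acc, fold_acc = cross_validate(X, y, fit_majority, predict_majority)
# mean_acc is close to 194 / 277, i.e. about 0.70
```

Every participant is used for testing exactly once, so the averaged score depends less on one particular split than a single hold-out evaluation does.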

Validation and building a DSS model
To validate the abstracted questionnaire, we designed a website to ask participants to provide data. The website had a description section to familiarize participants with the questionnaire and its aims. We added six reversed items among the 22 abstracted items to assure the answers' consistency and to be able to filter out careless participants. The Cronbach's alpha of the pruned responses is 0.82. Based on the pruned data, we train eight machine learning models to classify participants into two classes (NSSI Positive and NSSI Negative) based on their scores (answers) on each question. The reversed items are not used in this phase, in order to see how well the 22 items alone can represent NSSI involvement. The performances of the models are available in Table 8 and are calculated using 5-fold cross-validation. Note that we tested the four data augmentation algorithms for each of the methods in Table 8; however, only the best-performing one is reported for each algorithm. Overall, the results are consistent with those in Tables 4 and 7, since the three tables differ only negligibly from each other. This supports the view that the selected features are consistent across both statistical populations (the initial dataset and the collected one) and can represent NSSI involvement among different nations. However, it was interesting to see that the MLP algorithm could outperform its previous model (that in Table 7) and rank first with an accuracy, precision, recall, and F1-score of 83.6%, 83.6%, 83.9%, and 83.7%, respectively. The MLP model has 10 hidden layers with 40 neurons in each layer. Moreover, the RF algorithm shows only a very small difference from its result in Table 7, indicating its overall stability across different datasets. Since MLP outperformed the other methods, only its results are visualized (Fig 1).
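For reference, the internal-consistency figure quoted above can be computed as sketched below. This is a standard-library illustration of the usual Cronbach's alpha formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), on made-up toy rows; it is not the authors' code. Pingouin, which is listed among the libraries used, provides a `cronbach_alpha` function, but the text does not say whether it was the one applied.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`: one list of item scores per
    participant, with items in the same order for everyone."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


toy_rows = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 5], [3, 3, 4]]
alpha = cronbach_alpha(toy_rows)  # about 0.956 for these toy rows
```

Reversed items would need to be mapped back to the original direction before being included in such a calculation; otherwise they lower alpha artificially.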

Discussion
NSSI is a psychological disorder in which a person intentionally harms themselves without intending to die [1]. In the present study, an attempt has been made to provide a simpler and more efficient method for predicting NSSI disorder using several basic classification algorithms. For this purpose, after conducting various experiments using data mining and feature selection, an abstracted questionnaire was extracted that comprises only 22 questions, about three percent of the initial 662 questions, and summarizes 17 different questionnaires. Answering these questions takes only about five to six minutes (without the reversed items), which potentially increases confidence and accuracy, since participants are not bored while answering. Consequently, a machine learning algorithm can classify a participant with 83.6% accuracy. Our study lets psychologists find people involved in NSSI and start psychotherapeutic treatment as soon as possible. Another convincing reason for considering this study worldwide is its independence from questions that could potentially be answered with lies.
Prime examples of such questions are those asking about participants' involvement in deliberate self-injury. The majority of people who are involved in NSSI have been shown to hide this trait, as they do not wish to be objected to. Hence, including such questions may alter the results, and they are better left out.
Here, we mainly used traditional machine learning methods instead of more complex deep learning-based ones, since the former have been shown to be more robust when they are to consider statistical aspects or relations between different components of the input data, and they can perform better than the latter when the data are sparse and the inter-item variety is not significant [53][54][55][56][57][58][59][60]. Additionally, the data used in this approach are self-explanatory and do not need a feature extraction phase. Therefore, there is no need to use deep learning methods, as they usually rely on larger datasets for better performance. However, since the questions were shown to the participants in a static order, we thought that using an LSTM could be beneficial, as it can relate the bias of previous questions to the answer to the current one. Yet, the results suggested that this was not the case in this approach. We also did not use CNNs because they are inspired by neural behaviors in the eyes' receptive field and usually perform well on visual tasks, such as image detection and image classification [63]. Additionally, they multiply adjacent values of the input to form filters; thus, the importance of the individual values would be diminished if they were used in cases like the present study [63].
As seen in Tables 2 and 4, including the whole dataset confuses even a machine learning method. Therefore, psychologists, who are exposed to heavy workloads and often fatigue, are not immune to this confusion and thus may decide subjectively. However, using feature selection and machine learning algorithms, we showed that a more refined set of questions can be less confusing and can help people make more accurate decisions. Although some may say that the level of accuracy achieved in this research is still far from that of an accurate decision made by a psychologist, considering the delivery time and convenience of the abstracted questionnaire, it satisfies current psychological expectations. Furthermore, it is still more precise than a subjective psychologist who is exposed to fatigue and has to assess many participants who might not be honest with their responses in large and tedious questionnaires.
Three pioneering studies used data mining techniques to predict NSSI, two of which aimed to identify subgroups of individuals who engaged in NSSI by identifying splits on related variables [28]. The first used a machine learning technique to examine splits in the number of NSSI acts during the previous year as predicted by participant-reported psychological difficulties. Results demonstrated significant splits between zero and one-or-more past-year NSSI acts on the one hand, and between five and six-or-more past-year NSSI acts on the other hand. This suggested that participants reporting six or more past-year NSSI acts may represent a more severe group of self-injurers [1]. Another study built on the previous one and examined splits in NSSI age of onset as predicted by prior Self-Injurious Thoughts and Behaviors (SITB), including NSSI characteristics (e.g., number of NSSI-related hospital visits and NSSI frequency), suicidal ideation, suicidal planning, and suicide attempts. Results suggested there was a potential subgroup in the data representing those with an earlier age of onset (i.e., approximately 12 or younger); this subgroup reported a greater number of NSSI methods, higher NSSI frequency, and more NSSI-related hospital visits, in addition to an increased likelihood of suicidal planning [71]. The final study employed two machine learning techniques to identify important indicators of NSSI frequency, both explaining a significant proportion of the variance in NSSI frequency (R^2 = 0.48 and 0.46, respectively). The models indicated that the number of NSSI methods was the most important indicator of lifetime NSSI frequency; after removing the number of methods from the models, suicidal planning and depressive symptoms emerged as the most important predictors of NSSI frequency [3].
Previous state-of-the-art studies relied on time-series analysis to determine whether a person is involved in NSSI. For instance, within a sample of 1,021 high-risk self-injurious and/or suicidal individuals, Huang et al. [76] examined the accuracy of three different complex model types in predicting NSSI across 3, 14, and 28 days. In another study, Marti-Puig et al. [77] built a mobile application to collect data for classifying NSSI in young adults, focusing on their emotions only. After the data collection phase, they treated the data as a time series to test a person's involvement in NSSI. Such approaches require multiple records of data for their decision-making process, which is inconvenient when injurers are to be identified within a single session, namely the first session of their therapy. Moreover, the former approach achieves an 84.0% F1-score only after processing the data entries of the last three days, and the latter achieves merely a 22.9% F1-score after processing the last 15 days of a person's data. Additionally, both ask the user to state whether they have been involved in NSSI, and use the answer as the ground truth. Yet users may deliberately lie about this question in particular if they do not want to face objection, as mentioned previously. In contrast, our method achieves an 83.7% F1-score after processing only 22 items, without requiring the patient to disclose their NSSI behaviors. These properties make our approach easier to rely on and make it one of the most convenient methods of identifying those involved in NSSI.
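Because the comparison above rests on F1-scores, the following minimal sketch illustrates how accuracy and F1 are derived from the same set of predictions. It assumes a binary F1 computed on the positive class; the source does not state which averaging scheme was used, and the function name `binary_metrics` and the labels are hypothetical values chosen purely for illustration, not data from this study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption: F1 is computed for the `positive` class only.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = NSSI Positive, 0 = NSSI Negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Unlike accuracy, F1 ignores true negatives, so the two measures can diverge when the classes are imbalanced; this is one reason to report both.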
Some other successful machine learning-based methods aim to provide insight into key factors when a psychologist is deciding on NSSI involvement. For example, Gradus et al. [72] developed gender-stratified classification trees and random forests using 1,458 predictors, including demographic factors, family histories, psychiatric and physical health diagnoses, surgery, and prescribed medications. They found that SUD, prescribed psychiatric medications, previous poisoning diagnoses, and stress disorders were important factors for predicting suicide attempts among men and women. In 2021, Wallace et al. [67] tested classification trees that evaluated 298 potential correlates of NSSI and suicidal ideation across self-identified women and men. Psychopathology, poorer psychological well-being, and other SITBs emerged as important correlates for all participants. Trauma, disordered eating, and heavy alcohol use were salient among women, whereas alcohol use norms were important correlates among men. In a similar study in 2022, Yang et al. [78] proposed an SVM model which indicated that adolescents' gender, paranoid and histrionic personality traits, suffering physical abuse in childhood, emotional non-acceptance, and education level were associated with an increased risk of NSSI. These findings may alert psychologists to pay more attention to the mentioned factors. Nevertheless, merely outlining key factors may not be enough to persuade psychologists to adopt such methods when deciding on a diagnosis. Using all the factors together to also classify the trait would perhaps be more acceptable among psychologists, but it requires more complicated data processing. Automated methods, on the other hand, can quickly perform the computation and provide a diagnostic outcome. This also contributes to the convenience of such methods while saving time and energy for the psychologist.
Although our baseline study was conducted on a relatively small statistical population without broader validation, studies that rely on self-reports and lack additional objective components can still point to promising, novel directions for future research. For instance, in predicting a marker that had not previously been researched extensively, Twivy et al. [79] found that differences in affective flexibility towards emotional stimuli may be a positive indicator of anxiety. Soroski et al. [12] built a website to collect voice recordings from their participants in order to propose an Alzheimer's detection system. Sanders and Nosofsky [11] crowdsourced participants to reconstruct a psychological feature space for natural object detection in machine learning applications. That said, there is a demonstrated need for a universal measure free from self-reporting constraints, both for the reliability and the validity of current and future research. This need can be investigated through such pilot studies, which may lack additional objectivity yet give a fillip to stagnant areas of research.
Currently, other means of assessing human cognition and emotions are under review as possible alternatives to subjective self-reports. A report studying adults with schizophrenia spectrum disorders found that important indicators in these models include duration of illness, number of hospitalizations, emotional and physical abuse in childhood, and drug use or abuse [80]. Furthermore, computational models are concurrently being explored as a possible avenue for examining gender-specific risk profiles for suicide attempts, potentially providing an objective lens through which to consider SUD treatment, prescribed psychiatric medications, previous poisoning diagnoses, and stress disorders [72].
Future studies may aim to recruit non-university participants from other countries, although a strength of our study was the generalizability of the sample, as we recruited non-WEIRD (Western, Educated, Industrialized, Rich, and Democratic) [81] participants. Moreover, despite the reduced number of questions, the system still requires around five minutes of a person's time (reversed items not included) before it can decide on a result. Therefore, in the future, other cognitive constructs could be used to make NSSI classification more automated, for example via visual tasks (e.g., eye tracking, visual search).

Conclusion
People who engage in NSSI tend to deliberately damage their body tissues to dispel the negative psychological effects of their lives. Since they may hide this behavior, their parents or caretakers can notice it only if they catch the injurers in the act. The alternative is dedicated questionnaires; however, these are tedious and open to errors and lies. Therefore, we used data mining techniques to abstract an original questionnaire of 662 items down to only 22 questions, and trained machine learning algorithms to classify participants into two classes, NSSI Positive and NSSI Negative, with 83.6% accuracy. Since answering these questions takes only about five to six minutes, responses are expected to be more consistent, which should in turn improve the accuracy of, and confidence in, the result.
That said, a limitation of the current study is that it has been evaluated on only two statistical populations. Further evaluation requires recruiting more participants from different nations to examine the generalizability of such approaches. In future work, the approach could be extended with cognitive models; that is, cognitive models may be able to map particular aspects of complex human behavior (e.g., walking style, pauses in speech) to NSSI. Hence, to leverage Artificial Intelligence, academia may further examine such models for wider use in different settings.